Interactive Schema Integration with Sphinx
Abstract
The Internet has created a critical need for automated tools that help integrate countless databases. Because non-technical end users are often the ultimate repositories of the domain knowledge needed to distinguish differences in data types, we argue that an effective solution must combine simple GUI-based data browsing tools with automatic mapping methods that remove the need for technical users. We develop a meta-model of data integration as the basis for absorbing feedback from an end user. The schema integration algorithm draws examples from the data and learns integrating view definitions by asking the user simple yes-or-no questions. The meta-model enables a search mechanism that is guaranteed to converge to a correct integrating view definition without the user having to know a view definition language such as SQL, or even having to inspect the final view definition. We show how data catalog statistics, normally used to optimize queries, can be exploited to parameterize the search heuristics and improve the convergence of the learning algorithm.
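The question-driven learning loop described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or meta-model: it is a generic candidate-elimination loop, written under the assumption that the space of possible view definitions is a small, explicitly enumerated set of predicates over example rows. The names `learn_view`, `candidates`, `examples`, and `ask_user` are hypothetical, and the use of catalog statistics to order questions is left out.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, ...]
# A candidate view is modeled (as an assumption) by a predicate that says
# whether the view would include a given combined example row.
Candidate = Callable[[Row], bool]


def learn_view(
    candidates: Dict[str, Candidate],
    examples: Iterable[Row],
    ask_user: Callable[[Row], bool],
) -> List[str]:
    """Discard candidate views that contradict the user's yes/no answers."""
    alive = dict(candidates)
    for row in examples:
        votes = {name: pred(row) for name, pred in alive.items()}
        # A question is only informative when surviving candidates disagree.
        if len(set(votes.values())) < 2:
            continue
        answer = ask_user(row)
        alive = {name: pred for name, pred in alive.items() if votes[name] == answer}
        if len(alive) == 1:
            break
    return sorted(alive)


# Hypothetical rows: (id_a, name_a, id_b, name_b) drawn from two sources.
candidates = {
    "match_on_id": lambda r: r[0] == r[2],
    "match_on_name": lambda r: r[1] == r[3],
    "match_on_both": lambda r: r[0] == r[2] and r[1] == r[3],
}
examples = [
    ("1", "Ann", "1", "Ann"),  # every candidate agrees, so no question is asked
    ("1", "Ann", "2", "Ann"),  # separates name matching from the others
    ("3", "Bob", "3", "Rob"),  # separates id matching from matching on both
]

asked: List[Row] = []


def simulated_user(row: Row) -> bool:
    """Stand-in for the end user, who has id matching in mind."""
    asked.append(row)
    return row[0] == row[2]


result = learn_view(candidates, examples, simulated_user)
print(result, len(asked))
```

In this sketch the user never sees a view definition, only example rows, and the loop narrows the candidates as long as the intended view is among them and the examples separate it from the rest.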
Similar resources
SPHINX: Schema Integration by Example
We focus on the problem of semi-automated query discovery for XML views without requiring the intervention of an expert to guarantee a correct final result. Given multiple independent sources of heterogeneous XML data structures, our tool, SPHINX, lets a nontechnical user define views using example-based graphical interaction. SPHINX embodies a syntactically derived meta-model of federating ...
Technical Report 02-20 SPHINX: Schema Integration by Example
We focus on the problem of semi-automated query discovery for XML views without requiring the intervention of an expert to guarantee a correct final result. Given multiple independent sources of heterogeneous XML data structures, our tool, SPHINX, lets a naïve user define views using simple, example-based interaction. We show how federating view definitions may be represented using the version-...
Sphinx: Schema-conscious XML Indexing
User queries on XML documents are typically expressed as regular path expressions. A variety of indexing techniques for efficiently retrieving the results to such queries have been proposed in the recent literature. While these techniques are applicable to documents that are completely schema-less, in practice XML documents often adhere to a schema, such as a DTD. In this paper, we propose Sphi...
Schema-conscious XML indexing
User queries on extensible markup language (XML) documents are typically expressed as regular path expressions. A variety of indexing techniques for efficiently retrieving the results to such queries have been proposed in the recent literature. While these techniques are applicable to documents that are completely schema-less, in practice XML documents often adhere to a schema, such as a docume...
An Interactive Tool Based on XML Technology for Data Exchange between Heterogeneous ERP Systems
Data exchange between enterprise resource planning (ERP) systems in a supply chain system needs to fulfill requirements of both schema integration and message translation. Since ERP systems with relational database systems are developed independently, schema conflicts between databases are a common problem for schema integration. Thus, supply chain partners need to preserve the data integrity of...
Publication date: 2004